Mathematical Model Discovery For a Psychophysics Experiment¶
In this tutorial, we will explore how to use AutoRA for automatic mathematical model discovery. We'll start by using a synthetic runner, then transition to a runner that conducts the experiment on a website. This allows us to test the experiment both locally and online. Next, we'll demonstrate how to use a participant recruiter to gather participants from Prolific.
To follow along with this tutorial, you should first set up an AutoRA project via Cookiecutter. To do so, follow the tutorial here: Closed Loop Online Experiment. Choose the advanced setting and the example "Mathematical Model Discovery". This will allow you to easily deploy a website. The generated project also contains an AutoRA script that includes the finished Prolific-runner version we will create step by step in this tutorial.
Experiment Paradigm¶
As our experiment paradigm, we are going to use the Random Object Kinematogram. In our version of the experiment, the participant sees moving, oriented triangles. The participant has to react to the orientation as quickly and as accurately as possible.
In this experiment, we are interested in the congruency effect: the orientation of the triangles and their movement direction can align (congruent trials) or not (incongruent trials). Typically, in such experiments, congruent trials are easier and faster than incongruent trials. Here, we are interested in how the coherence of each feature (orientation and movement) acts on the congruency effect. The coherence of the orientation is the proportion of triangles facing the coherent direction as opposed to facing a random direction. The coherence of the movement is the proportion of triangles moving in the coherent direction as opposed to moving randomly.
We can express the experiment in terms of dependent and independent variables:
- The dependent variable is the congruency effect. We calculate this effect as the difference in reaction times (on correct trials) between congruent and incongruent trials.
- The independent variables are movement coherence and orientation coherence.
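To make the dependent variable concrete, here is a minimal sketch of how a (normalized) congruency effect could be computed from mean reaction times (the numbers and the 2000 ms response deadline are assumptions for illustration):

```python
# Hypothetical mean reaction times (ms) on correct trials
mean_rt_congruent = 650.0
mean_rt_incongruent = 720.0
# Assumed maximum response time, used for normalization
max_response_time = 2000.0

# Incongruent trials are slower; the normalized difference is the congruency effect
congruency_effect = (mean_rt_incongruent - mean_rt_congruent) / max_response_time
print(congruency_effect)  # 0.035
```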
SweetPea: Counterbalanced Trial Sequences¶
From the sampled conditions (movement coherence and orientation coherence), we will create a trial sequence. A trial sequence is a list of trial features that contain more information than just the coherences. For example, the direction of the movement and orientation. Here we use SweetPea to create such trial sequences. Using SweetPea has several advantages, such as allowing us to counterbalance factors like direction and transitions. For example, we aim for congruent and incongruent conditions to appear equally often, and also to balance their transitions (congruent to congruent, congruent to incongruent, incongruent to congruent, and incongruent to incongruent). This counterbalancing ensures that our trial sequences are well-structured and that each condition and transition type is adequately represented.
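The transition counterbalancing can be illustrated with a toy check: in a balanced sequence, each of the four congruency transitions occurs equally often. The 9-trial sequence below is a hand-picked example for illustration, not SweetPea output:

```python
from collections import Counter

# A hand-balanced toy sequence of congruent ('c') and incongruent ('i') trials
sequence = ['c', 'c', 'c', 'i', 'i', 'i', 'c', 'i', 'c']

# Count the transitions between consecutive trials
transitions = Counter(zip(sequence, sequence[1:]))
print(sorted(transitions.items()))
# [(('c', 'c'), 2), (('c', 'i'), 2), (('i', 'c'), 2), (('i', 'i'), 2)]
```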
AutoRA-Core: Defining the Variables¶
To define an AutoRA experiment, we first need to define the variables and a variable collection. In AutoRA, we can name the variables and define value ranges. These value ranges can be used in other AutoRA-components. For example, the allowed value range is used in grid poolers to determine a grid of possible experiment conditions.
import numpy as np
from autora.variable import Variable, VariableCollection
# ** Dependent Variable ** #
# Here, we assume the (normalized) congruency effect as a number between 0 and 1.
congruency_effect = Variable(name='congruency_effect', value_range=(0, 1))
# ** Independent Variables ** #
# Here, coherence is the ratio between coherent and random movement (or orientation) in steps of .01
coherence_movement = Variable(name='coherence_movement', value_range=(0, 1), allowed_values=np.linspace(.01, 1, 100))
coherence_orientation = Variable(name='coherence_orientation', value_range=(0, 1), allowed_values=np.linspace(.01, 1, 100))
# ** Variable Collection ** #
variables = VariableCollection(dependent_variables=[congruency_effect], independent_variables=[coherence_movement, coherence_orientation])
AutoRA-Core: State¶
An AutoRA workflow is state-based. This means each component runs on the same state object. The state object can have multiple fields that get altered via AutoRA components. For example, an experimentalist typically populates the conditions field, an experiment runner typically populates the experiment_data field, and a theorist typically populates the models field. The user can define custom state fields, but here we use the StandardState, which consists of a variables, a conditions, an experiment_data, and a models field:
from autora.state import StandardState
state = StandardState(variables=variables)
AutoRA-Component: Random Experimentalist¶
Here, we use a random experimentalist as the first AutoRA component. This experimentalist only needs the variable definition and creates experimental conditions. Later in this tutorial, we will see more sophisticated experimentalists that use active-learning methods to determine experimental conditions based on model predictions.
Here, we are also introduced to the on_state and Delta mechanics. To use any function on the state, we wrap it with the on_state functionality.
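Before using the real implementation, here is a conceptual, stripped-down sketch of what on_state and Delta do (this toy code is an assumption about the semantics, not the actual AutoRA implementation): the wrapped function receives state fields as arguments and returns a delta, which the wrapper merges into a new state.

```python
from dataclasses import dataclass, field, replace

@dataclass(frozen=True)
class ToyState:
    conditions: list = field(default_factory=list)

def toy_on_state(fn):
    # Pull the inputs from the state's fields, merge the returned delta back in
    def wrapped(state, **kwargs):
        delta = fn(conditions=state.conditions, **kwargs)
        return replace(state, **delta)
    return wrapped

@toy_on_state
def add_condition(conditions, value):
    return {'conditions': conditions + [value]}

s = ToyState()
s = add_condition(s, value=0.5)
print(s.conditions)  # [0.5]
```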
# ** Grid Pooler ** #
# We create a grid of conditions (each combination of coherence_movement and coherence_orientation):
from autora.state import on_state
from autora.experimentalist.grid import grid_pool
# To wrap the grid_pool function for the use on the state, we pass the function and the field into the on_state method:
grid_pool_on_state = on_state(grid_pool, output=["conditions"])
# Run the function on the state:
state = grid_pool_on_state(state)
print(state.conditions)
# ** Random Sampler ** #
# Since we don't want our experiment to run on the full grid, we can now randomly sample from the conditions using a random sampler:
from autora.experimentalist.random import random_sample
# Define the function on the state:
random_sample_on_state = on_state(random_sample, output=["conditions"])
# Run the function on the state:
state = random_sample_on_state(state, num_samples=3)
state.conditions
Note:
This workflow (creating a grid and then sampling from it) is not very efficient, for two reasons:
(1) Creating a grid might not be feasible if there are more independent variables or if their resolution is higher.
(2) The conditions field is overwritten by the random sampler. Therefore, each time the user wants different conditions, the grid has to be created again.
AutoRA provides a method to directly create random conditions: from autora.experimentalist.random import pool. Alternatively, we could create a custom state with a pool field and create the grid only once. The example above was chosen to demonstrate the ability to chain experimentalists.
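As a rough illustration of what such a direct random pooler does, here is a sketch under the assumption that it draws each independent variable from its allowed values, without materializing the full grid first (this is not the library's actual code):

```python
import numpy as np

rng = np.random.default_rng(seed=42)

def random_pool_sketch(allowed_values_per_iv, num_samples):
    # Draw one allowed value per independent variable for each condition,
    # skipping the construction of the full grid
    return [tuple(rng.choice(values) for values in allowed_values_per_iv)
            for _ in range(num_samples)]

conditions = random_pool_sketch(
    [np.linspace(.01, 1, 100), np.linspace(.01, 1, 100)], num_samples=3)
print(len(conditions))  # 3
```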
AutoRA-Component: Experiment Runner (Synthetic)¶
Before running an experiment online, it is often a good idea to simulate data. This might be beneficial not only for debugging but also to sanity check predictions or computational models. Here, we use an equation runner to run the experiment on predefined equations.
The equation-runner is not part of the core AutoRA package and needs to be installed separately:
%%capture
!pip install autora-synthetic-abstract-equation
We use sympy to create an equation experiment and wrap it to run on state:
# ** Equation Runner (arbitrary) ** #
from autora.experiment_runner.synthetic.abstract.equation import equation_experiment
from sympy import symbols
# Declare our variables
coherence_movement, coherence_orientation = symbols("coherence_movement coherence_orientation")
# Declare the expression (this is arbitrary here)
expr = (coherence_movement**2)/3 + (coherence_orientation**2)/3 + coherence_movement * coherence_orientation/3
# We declare the experiment
equation_experiment_arbitrary = equation_experiment(expr, variables.independent_variables, variables.dependent_variables[0])
# Wrap this to run it on state (note: we wrap the run function, not the full experiment, which includes more functionality, e.g., for plotting data)
run_eq_arbitrary_on_state = on_state(equation_experiment_arbitrary.run, output=['experiment_data'])
After defining an equation experiment on state, we can use it just as we used the experimentalists:
state = run_eq_arbitrary_on_state(state)
state.experiment_data
AutoRA-Component: Theorist (PySR)¶
As a theorist, we use PySR, a symbolic regression package that searches for mathematical expressions describing the data. It is not part of the core AutoRA package and needs to be installed separately:
%%capture
!pip install pysr
We create a theorist from a regressor by defining it and wrapping it with the on_state function so it is usable on the state:
from pysr import PySRRegressor
# The PySRRegressor doesn't use the standard sklearn output format for the predict function; here, we adjust this:
class AdjustedPySRRegressor(PySRRegressor):
    def predict(self, X, index=None):
        y = super().predict(X, index)
        # Ensure a 2-dimensional output, as expected by sklearn conventions
        if len(y.shape) < 2:
            return np.array([[el] for el in y])
        return y
# We define the set of mathematical operations for the symbolic regression algorithm
binary_operators = ["+", "-", "*", "/", "^"]
unary_operators = ["sin", "cos", "tan", "exp", "log", "sqrt"]
# Theorist
pysr_regressor = AdjustedPySRRegressor(niterations=100,
                                       binary_operators=binary_operators,
                                       unary_operators=unary_operators,
                                       batching=True,
                                       multithreading=True,
                                       temp_equation_file=False)
# Here, we show how to use the on_state wrapper as a decorator. Note: if state fields should be used as input
# arguments to the wrapped function, the argument names have to align with the field names (here: experiment_data
# and variables). The same is true for the output Delta: `models` is a field of the StandardState.
from autora.state import Delta
@on_state()
def pysr_theorist_on_state(experiment_data, variables: VariableCollection):
    ivs = [v.name for v in variables.independent_variables]
    dvs = [v.name for v in variables.dependent_variables]
    X, y = experiment_data[ivs], experiment_data[dvs]
    new_model = pysr_regressor.fit(X, y)
    return Delta(models=[new_model])
Now, we can run this on the state:
state = pysr_theorist_on_state(state)
We can access the current result of the equation search by accessing the last model of the state:
state.models[-1].sympy()
Closed Loop: Random Sampling + Synthetic Runner + PySR¶
After creating the functions, we can now run them in a closed loop:
for _ in range(3):
    state = grid_pool_on_state(state)
    state = random_sample_on_state(state, num_samples=3)
    state = run_eq_arbitrary_on_state(state)
    state = pysr_theorist_on_state(state)
    print(state.models[-1].sympy())
Active Learning - Falsification Experimentalist¶
The full potential of a closed loop is realized when active learning strategies are employed to select the conditions to probe, as opposed to random sampling. In this context, we are using a falsification sampler, which identifies novel experimental conditions where the loss of the best candidate model is predicted to be the highest. This sampler is part of the AutoRA ecosystem:
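Conceptually, falsification-style sampling scores candidate conditions by where the current model already fits the observed data poorly, and picks the worst-fit regions. The following toy sketch is an assumed simplification (the actual package predicts the model's loss with a trained network); here, each candidate is scored by the residual of its nearest observed point:

```python
import numpy as np

def falsification_sketch(candidates, model_predict, X_obs, y_obs, num_samples):
    # Absolute error of the current model on the observed data
    residuals = np.abs(y_obs - model_predict(X_obs))
    # Distance from every candidate to every observed condition
    dists = np.linalg.norm(candidates[:, None, :] - X_obs[None, :, :], axis=-1)
    # Score each candidate by the residual at its nearest observed point
    scores = residuals[np.argmin(dists, axis=1)]
    # Return the highest-scoring candidates
    return candidates[np.argsort(scores)[::-1][:num_samples]]

X_obs = np.array([[0.2], [0.8]])
y_obs = np.array([0.25, 1.5])      # the toy model below fits poorly near 0.8
model = lambda X: X[:, 0]          # toy model: identity
candidates = np.array([[0.1], [0.9]])
print(falsification_sketch(candidates, model, X_obs, y_obs, num_samples=1))  # [[0.9]]
```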
%%capture
!pip install "autora[experimentalist-falsification]"
from autora.experimentalist.falsification import falsification_sample
# Wrap to use on state:
@on_state()
def falsification_sample_on_state(conditions, models, experiment_data, variables, num_samples):
    ivs = [v.name for v in variables.independent_variables]
    dvs = [v.name for v in variables.dependent_variables]
    X, y = experiment_data[ivs], experiment_data[dvs]
    return Delta(conditions=falsification_sample(conditions, models[-1], X, y, variables, num_samples=num_samples))
# create a grid
state = grid_pool_on_state(state)
# sample 100 samples as candidates for the falsification sampler
state = random_sample_on_state(state, num_samples=100)
# sample 3 samples via the falsification sampler
state = falsification_sample_on_state(state, num_samples=3)
state.conditions
With this, we can create a closed loop with the falsification sampler. Here, we alternate between random sampling and falsification sampling. Note that the falsification sampler would fail on the first cycle, since there is no model or experiment_data to falsify yet; we therefore start with random sampling.
# reset the state:
state = StandardState(variables)
for cycle in range(4):
    state = grid_pool_on_state(state)
    if not cycle % 2:  # even cycles: random sampling
        state = random_sample_on_state(state, num_samples=3)
    else:  # odd cycles: falsification sampling from 100 random candidates
        state = random_sample_on_state(state, num_samples=100)
        state = falsification_sample_on_state(state, num_samples=3)
    state = run_eq_arbitrary_on_state(state)
    state = pysr_theorist_on_state(state)
    print(state.models[-1].sympy())
Online Experiment - Firebase Experiment Runner¶
After testing the experimental strategy and the theorist on synthetic data, we can use an experiment runner that collects data from a website. Here, we assume that you have set up a website on Firebase (for example, by following the tutorial here: Closed Loop Online Experiment, choosing the example "Mathematical Model Discovery").
Generating a Trial Sequence¶
Instead of plain conditions (coherence of movement and orientation), we generate trial sequences to send to Firebase. The format of the conditions depends on the experiment website. Here, we create a trial sequence and use it as timeline variables for a jsPsych experiment (see jsPsych - timelineVariables). To generate a (counterbalanced) trial sequence, we use SweetPea. For more information on SweetPea and tutorials, see SweetPea - Tutorials.
The following code creates sequences with counterbalanced directions, counterbalanced congruency, and counterbalanced congruency transitions.
Installation:
%%capture
!pip install sweetpea
Define a function to generate trial sequences with SweetPea:
from sweetpea import (
    Factor, DerivedLevel, WithinTrial,
    Transition, MinimumTrials, CrossBlock,
    synthesize_trials, CMSGen, experiments_to_dicts
)
def condition_to_trial_sequence(coherence_movement,
                                coherence_orientation,
                                number_blocks=2,
                                minimum_trials=48):
    # ** Regular Factors ** #
    # Define factors with static levels (here, 0 and 180 degrees for left and right)
    direction_movement = Factor('direction_movement', [0, 180])
    direction_orientation = Factor('direction_orientation', [0, 180])

    # ** Derived Factors ** #
    # Derived factors are derived from other factors, either within a trial (WithinTrial)
    # or across trials (Transition).
    # First, we define boolean functions for each level of a derived factor:

    # * Within: Congruency * #
    # Here, `congruency` has two levels: `congruent` and `incongruent`.
    # A trial is `congruent` if the direction of the movement is equal to the direction of the orientation
    def is_congruent(dir_mov, dir_or):
        return dir_mov == dir_or

    # A trial is `incongruent` if it is not congruent
    def is_incongruent(dir_mov, dir_or):
        return not is_congruent(dir_mov, dir_or)

    # The level `congruent` is "calculated" by the function `is_congruent`
    # by passing in the factors `direction_movement` and `direction_orientation`
    congruent = DerivedLevel('congruent',
                             WithinTrial(is_congruent,
                                         [direction_movement, direction_orientation]))
    # The level `incongruent` is "calculated" by the function `is_incongruent`
    # by passing in the factors `direction_movement` and `direction_orientation`
    incongruent = DerivedLevel('incongruent',
                               WithinTrial(is_incongruent,
                                           [direction_movement, direction_orientation]))
    # The factor `congruency` is defined by its name and the two levels it can have
    congruency = Factor('congruency', [congruent, incongruent])

    # * Transition: Congruency Transition * #
    # We create the factor `congruency_transition` with four levels (all possible transitions).
    # First, we define the boolean functions.
    # Here, we index the trials with 0 being the current and -1 being the previous trial:
    def is_transition_cc(cong):
        return cong[-1] == 'congruent' and cong[0] == 'congruent'

    def is_transition_ci(cong):
        return cong[-1] == 'congruent' and cong[0] == 'incongruent'

    def is_transition_ic(cong):
        return cong[-1] == 'incongruent' and cong[0] == 'congruent'

    def is_transition_ii(cong):
        return cong[-1] == 'incongruent' and cong[0] == 'incongruent'

    # Name and define the levels with their function and the argument to pass in:
    transition_cc = DerivedLevel('cc', Transition(is_transition_cc, [congruency]))
    transition_ci = DerivedLevel('ci', Transition(is_transition_ci, [congruency]))
    transition_ic = DerivedLevel('ic', Transition(is_transition_ic, [congruency]))
    transition_ii = DerivedLevel('ii', Transition(is_transition_ii, [congruency]))
    # Name and define the factor with its levels
    congruency_transition = Factor('congruency_transition',
                                   [transition_cc, transition_ci, transition_ic, transition_ii])

    # Design: all the factors that define a trial
    design = [direction_movement, direction_orientation, congruency, congruency_transition]
    # Crossing: which factors to cross.
    # In this case, the crossing of direction_movement and congruency_transition ensures
    # the crossing of the other factors.
    crossing = [direction_movement, congruency_transition]
    # Constraints: SweetPea allows us to constrain the sequence (for example, how many
    # congruent trials in a row are allowed). In this case, we only constrain the minimum
    # number of trials.
    constraints = [MinimumTrials(minimum_trials)]
    # Create a block
    block = CrossBlock(design, crossing, constraints)
    # Synthesize trial sequences using the CMSGen sampler (for more information about the
    # different samplers, see the SweetPea documentation)
    experiments = synthesize_trials(block, number_blocks, CMSGen)
    # Format the experiments for convenient use in jsPsych
    sequence_list = experiments_to_dicts(block, experiments)
    # For each trial, add the coherence
    for sequence in sequence_list:
        for trial in sequence:
            trial['coherence_movement'] = coherence_movement
            trial['coherence_orientation'] = coherence_orientation
    return sequence_list
This function returns a list of trial sequences:
condition_to_trial_sequence(.5, .3)
Postprocessing Data¶
Here, we get the data from jsPsych in JSON format, as a list of trials with trial data and response data. We need to process this data into AutoRA experiment_data: a pandas DataFrame with columns for the independent and dependent variables.
import pandas as pd
def trial_list_to_experiment_data(trial_sequence):
    """
    Parse a trial sequence (from jsPsych) into dependent and independent variables.
    Independent: coherence_movement, coherence_orientation.
    Dependent: congruency_effect (rt_incongruent - rt_congruent) / max_response_time.
    """
    # Initialize a dictionary to collect trial data
    res_dict = {
        'coherence_movement': [],
        'coherence_orientation': [],
        'congruency': [],
        'rt': []
    }
    # Loop through each trial in the sequence
    for trial in trial_sequence:
        # Filter out non-relevant trials (not 'rok') and incorrect responses
        if trial['trial_type'] != 'rok' or 'correct' not in trial or not trial['correct']:
            continue
        # Determine congruency
        congruency = trial['coherent_movement_direction'] == trial['coherent_orientation']
        # Normalize coherence values
        coherence_movement = trial['coherence_movement'] / 100.0
        coherence_orientation = trial['coherence_orientation'] / 100.0
        # Extract reaction time
        rt = trial['rt']
        # Append values to the dictionary
        res_dict['congruency'].append(int(congruency))
        res_dict['coherence_movement'].append(coherence_movement)
        res_dict['coherence_orientation'].append(coherence_orientation)
        res_dict['rt'].append(rt)
    # Convert the dictionary to a DataFrame
    dataframe_raw = pd.DataFrame(res_dict)
    # Group by coherence_movement and coherence_orientation
    grouped = dataframe_raw.groupby(['coherence_movement', 'coherence_orientation'])
    # Calculate mean reaction times for congruent and incongruent trials
    mean_rt = grouped.apply(lambda x: pd.Series({
        'mean_rt_congruent': x[x['congruency'] == 1]['rt'].mean(),
        'mean_rt_incongruent': x[x['congruency'] == 0]['rt'].mean()
    }, index=['mean_rt_congruent', 'mean_rt_incongruent']), include_groups=False).reset_index()
    # Calculate the congruency effect
    max_response_time = 2000  # Assuming a maximum response time of 2000 ms
    mean_rt['congruency_effect'] = (mean_rt['mean_rt_incongruent'] - mean_rt['mean_rt_congruent']) / max_response_time
    # Return the relevant columns
    return mean_rt[['coherence_movement', 'coherence_orientation', 'congruency_effect']]
For example:
jsPsych_trial_data = [
    {'rt': 1500, 'correct': 1, 'trial_type': 'rok',
     'coherent_movement_direction': 0, 'coherent_orientation': 180,
     'coherence_movement': 20, 'coherence_orientation': 80},
    {'rt': 500, 'correct': 0, 'trial_type': 'rok',
     'coherent_movement_direction': 180, 'coherent_orientation': 0,
     'coherence_movement': 20, 'coherence_orientation': 80},
    {'rt': 900, 'correct': 1, 'trial_type': 'rok',
     'coherent_movement_direction': 180, 'coherent_orientation': 180,
     'coherence_movement': 20, 'coherence_orientation': 80},
    {'rt': 700, 'correct': 1, 'trial_type': 'rok',
     'coherent_movement_direction': 0, 'coherent_orientation': 0,
     'coherence_movement': 20, 'coherence_orientation': 80}
]
trial_list_to_experiment_data(jsPsych_trial_data)
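As a quick sanity check, we can work through these example trials by hand: the incorrect trial is dropped, trials 3 and 4 are congruent, and trial 1 is incongruent.

```python
# Correct 'rok' trials only, grouped by congruency of the two directions
rts_congruent = [900, 700]     # movement and orientation direction match
rts_incongruent = [1500]       # directions differ
max_response_time = 2000       # same assumption as in the parsing function

mean_congruent = sum(rts_congruent) / len(rts_congruent)        # 800.0
mean_incongruent = sum(rts_incongruent) / len(rts_incongruent)  # 1500.0
congruency_effect = (mean_incongruent - mean_congruent) / max_response_time
print(congruency_effect)  # 0.35
```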
Runner¶
With these two functions in place, we can create a Firebase runner. Here, we also need Firebase credentials (found in the project settings of your Firebase project).
Credentials:
firebase_credentials = {
    "type": "type",
    "project_id": "project_id",
    "private_key_id": "private_key_id",
    "private_key": "private_key",
    "client_email": "client_email",
    "client_id": "client_id",
    "auth_uri": "auth_uri",
    "token_uri": "token_uri",
    "auth_provider_x509_cert_url": "auth_provider_x509_cert_url",
    "client_x509_cert_url": "client_x509_cert_url"
}
Install the runner
%%capture
!pip install "autora[experiment-runner-firebase-prolific]"
Initialize the runner and wrap it on state:
import json
from autora.experiment_runner.firebase_prolific import firebase_runner
experiment_runner_firebase = firebase_runner(
    firebase_credentials=firebase_credentials,
    time_out=100,
    sleep_time=5)

@on_state()
def run_firebase_on_state(conditions, num_blocks, num_trials):
    trial_sequences = []
    # Loop through the rows of the condition DataFrame and create a trial sequence for each
    for _, row in conditions.iterrows():
        trial_sequences.append(condition_to_trial_sequence(
            row['coherence_movement'],
            row['coherence_orientation'],
            num_blocks,
            num_trials))
    # Run the experiment online and wait for the participant data
    # (head over to the website given in the Firebase console and try the experiment)
    data_raw = experiment_runner_firebase(trial_sequences)
    # Initialize experiment data as a DataFrame
    experiment_data = pd.DataFrame()
    # data_raw is a list of trial sequences from jsPsych, including all trials
    # (also fixation crosses, instructions, ...) for each participant
    for item in data_raw:
        # Process the data
        _lst = json.loads(item)['trials']
        _df = trial_list_to_experiment_data(_lst)
        # Append the data to the experiment data
        experiment_data = pd.concat([experiment_data, _df], axis=0)
    return Delta(experiment_data=experiment_data)
Now, we can replace the synthetic runner from earlier with the online firebase runner:
state = StandardState(variables=variables)
for cycle in range(4):
    state = grid_pool_on_state(state)
    if not cycle % 2:  # even cycles: random sampling
        state = random_sample_on_state(state, num_samples=3)
    else:  # odd cycles: falsification sampling from 100 random candidates
        state = random_sample_on_state(state, num_samples=100)
        state = falsification_sample_on_state(state, num_samples=3)
    # state = run_eq_arbitrary_on_state(state)
    state = run_firebase_on_state(state, num_blocks=2, num_trials=48)
    state = pysr_theorist_on_state(state)
    print(state.models[-1].sympy())
Online Experiment - Prolific Recruitment Experiment Runner¶
After testing the online experiment with the Firebase runner, we can add Prolific recruitment. For this, we need to make sure that the experiment is deployed online and we have a URL for it. We also need an API token from Prolific and enough money in our account to run the experiment.
Instead of using the firebase_runner, we use the firebase_prolific_runner, which expects additional arguments:
- firebase_credentials: a dict with Firebase service-account credentials
- sleep_time: the time between checks of the Firebase database and updates of the Prolific study
- study_name: a name for the study shown on Prolific
- study_description: a description for the study shown on Prolific
- study_url: the URL to your experiment
- study_completion_time: the average completion time for a participant to complete the study
- prolific_token: the API token from Prolific
from autora.experiment_runner.firebase_prolific import firebase_prolific_runner
YOUR_PROLIFIC_TOKEN = 'your_token'
experiment_runner_prolific = firebase_prolific_runner(
    firebase_credentials=firebase_credentials,
    sleep_time=10,
    study_name='Triangle Chaos',
    study_description='Reaction time experiment in under 1 minute.',
    study_url='your_url',
    study_completion_time=1,
    prolific_token=YOUR_PROLIFIC_TOKEN
)
# The run function is the same, only with a different runner:
@on_state()
def run_prolific_on_state(conditions, num_blocks, num_trials):
    trial_sequences = []
    for _, row in conditions.iterrows():
        trial_sequences.append(condition_to_trial_sequence(
            row['coherence_movement'],
            row['coherence_orientation'],
            num_blocks,
            num_trials))
    data_raw = experiment_runner_prolific(trial_sequences)
    experiment_data = pd.DataFrame()
    for item in data_raw:
        _lst = json.loads(item)['trials']
        _df = trial_list_to_experiment_data(_lst)
        experiment_data = pd.concat([experiment_data, _df], axis=0)
    return Delta(experiment_data=experiment_data)
Now, we can replace the plain Firebase runner with the Firebase runner that includes Prolific recruitment. Here, we also store the state after each cycle to save the data to disk:
from autora.serializer import dump_state
from pathlib import Path
state = StandardState(variables=variables)
for cycle in range(4):
    state = grid_pool_on_state(state)
    if not cycle % 2:  # even cycles: random sampling
        state = random_sample_on_state(state, num_samples=3)
    else:  # odd cycles: falsification sampling from 100 random candidates
        state = random_sample_on_state(state, num_samples=100)
        state = falsification_sample_on_state(state, num_samples=3)
    # state = run_firebase_on_state(state, num_blocks=2, num_trials=48)
    state = run_prolific_on_state(state, num_blocks=2, num_trials=48)
    state = pysr_theorist_on_state(state)
    dump_state(state, Path('./prolific_state.pkl'))
    print(state.models[-1].sympy())